A discussion of ‘Associations between the 11-year solar cycle, the QBO and the atmosphere. Part I’
[Labitzke K. and van Loon H. (1988), J. atmos. terr. Phys. 50, 197]


Labitzke and van Loon (1988), in the above paper,
have published data that are striking and suggestive.
Their work has received much attention and has
inspired further research. The purpose of the present
note is to reconsider the question of whether their
results could be accounted for by chance alone. Our
discussion, which is intended to be constructive, will be
mainly limited to their paper.

The problem can be divided into three sub-questions.
(a) What is the probability that a particular
‘QBO transformation’ of the stratospheric North Pole
temperature time-series reveals an 11-yr solar cycle?
(b) What is the probability that three cycles of a given
geophysical variable would exhibit the same period as
the solar cycle? And finally (c), what is the ‘statistical
significance’ of the 0.76 correlation found by them in
view of the fact that the samples of the two time-series
(temperature and solar flux) are very sinusoidal in
shape? Most of our discussion concerns this last
question, because we wish to show that the statistical
test employed by Labitzke and van Loon is inconclusive.
Thus the main point of our paper is not
merely to urge caution. Instead, our main point is
to show that Labitzke and van Loon did
not do anything which validly establishes statistical
significance.

The first question, (a), raises the issue ‘how many
other transformations were tried first?’ It would make
a difference if the QBO filter were the very first one
tried or just the last of very many. It will certainly be
more impressive if their transformation continues to
succeed equally well for the next several cycles without
any subsequent ‘rule changes’. This issue, among
others, has recently received attention from Baldwin and
Dunkerton (1989), and we recommend their paper
to the interested reader. Briefly, we do not see how a
rigorous test can be made except by repeating exactly
what they have done on the next several cycles.


With only 32 data points, any a posteriori association
must be considered tentative. The fact that
the association was not predicted before examining
the data makes it impossible to assign any realistic
significance level to the association. The tacit assumption
in the Monte-Carlo test is that ‘x’ successes
obtainable from ‘y’ independent trials (i.e. particular
transformations applied) determine the chance probability
of a success. With a small and well studied
sample of data, the fact that the association is a posteriori
implies of necessity that ‘y’ in this case does not
equal unity. A referee has written that ‘certainly there
may have been subconscious filtering prior to actually
writing out the series; no one can really say’. This is
our point. What value can one assign to ‘y’? Is it 3,
10, or significantly more than 10? Hadamard’s book
‘The Psychology of Invention in the Mathematical
Field’ (Hadamard, 1945) suggests the last possibility;
but there is no actual way to determine the answer.
With this in mind, then, we are faced with the fact that
the levels of significance are based on the assumption
that ‘x’ and ‘y’ equal unity. We do not claim to have
the value to assign to ‘y’. What we claim is that
Labitzke and van Loon are also unable to justify a
value for ‘y’ and, in particular, cannot assume that
‘y’ = unity. Thus we must urge caution in accepting
the association at face value.
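
The dependence on ‘y’ can be made concrete with a minimal sketch. It assumes, purely for illustration, that the ‘y’ trials are independent and that each has the same nominal chance probability p of producing a spurious success; the function name and the values of p and y below are hypothetical and are not taken from Labitzke and van Loon.

```python
def chance_of_spurious_success(p, y):
    """Probability that at least one of y independent trials succeeds by chance.

    Assumption (illustrative only): every trial has the same chance
    probability p, and the trials are mutually independent.
    """
    return 1.0 - (1.0 - p) ** y


if __name__ == "__main__":
    # Nominal 5% level for a single trial, then for 3 and 10 trials.
    for y in (1, 3, 10):
        print(y, round(chance_of_spurious_success(0.05, y), 3))
```

Under these assumptions a nominal 5% level applies only when y = 1; it grows to roughly 14% for y = 3 and to roughly 40% for y = 10.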

Another point to make is that even a totally valid
Monte-Carlo experiment of this type would not
exclusively indicate that the cause must be of solar
origin. An alternative physical possibility that cannot
be ruled out is, for example, that the nonlinear dynamics
of the Earth’s atmosphere have caused a long-period,
self-sustained oscillation of approximately the
same length as that of the Sun’s cycle. Later on we
shall mention yet another possible interpretation, due
to Teitelbaum and Bauer (1990).

The second question is: ‘What is the probability
that a geophysical variable exhibits a ‘solar cycle’
period?’ Trefil (1987) devotes part of a chapter of his
book to this question. He listed a number of past
observations of correlations between the solar cycle
and geophysical variables ranging from air pressure
in India (low at sunspot maximum) to the depth of
Lake Victoria in Africa. Of these several examples,
all have subsequently failed. Trefil went on to show
that the lengths of women’s skirts (relative to their
height) varied between 1926 and 1980 with a solar
cycle period. [van Loon (private conversation) has
shown one of us (E. D.) an amazingly high correlation
between sunspot numbers and the number of Republicans
in the U.S. Senate between 1965 and 1985.]
Trefil concluded that in view of the large number of
geophysical fluctuations which one can try to correlate
with sunspots it is more or less guaranteed that one is
bound to find that some of them do correlate. The
reader may find it useful to consult his book for his
numerical arguments; however, he concludes that he
would not take seriously any correlation with fewer than
eight cycles.

We now arrive at the third question, i.e. that of the
assignment of a level of ‘statistical significance’ to
the correlation coefficient (of value 0.76) found by
Labitzke and van Loon. They used the following procedure.
First they quoted Panofsky and Brier (1963),
who gave as a test for significance at the 95% level

$$r_{95} = \frac{1.96}{\sqrt{n_e - 2}} \qquad (1)$$

where $r_{95}$ is the correlation coefficient lower bound
for the 95% level of significance and $n_e$ is the effective
number of independent samples. The latter number
can be derived via a procedure which was first discussed
by Davis (1976). But first it should be noted
that the idea is to take into account the non-zero
auto-correlations in the two series considered here; the
need to do so is pointed out, for example, in Jenkins
and Watts (1968). Historically this problem was first
pointed out by Yule (1926). He showed that the
ordinary tests for the statistical significance of correlation
coefficients assume that the two datasets
being compared resemble purely random noise. In
other words, each successive point is independent of
preceding points. Sine waves and other
smooth functions, for example, violate this requirement.

The procedure of Davis (1976) consists of the following.
First one calculates the normalized auto-correlations
of each of the two series

$$C_T(\tau) = \frac{\langle T(t)\,T(t+\tau)\rangle}{\langle T^2(t)\rangle}, \qquad C_S(\tau) = \frac{\langle S(t)\,S(t+\tau)\rangle}{\langle S^2(t)\rangle}$$


where T(t) and S(t) correspond to the temperature
and solar flux cycles and the brackets indicate
averages. Since these are sampled series, one defines
$\Delta t$ as the sampling interval. He then calculated

$$\tau_e = \sum_{i=-\infty}^{\infty} C_T(i\,\Delta t)\, C_S(i\,\Delta t)\, \Delta t. \qquad (2)$$

From this one can define

$$n_e = \frac{N\,\Delta t}{\tau_e}$$

where N is the total number of points.
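
The procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation of Labitzke and van Loon or of Davis: the series are treated as anomalies (the mean is removed before the averages are formed), the infinite sum in (2) is truncated at a user-chosen `max_lag`, and the threshold function assumes the form $1.96/\sqrt{n_e - 2}$ for equation (1), which reproduces the value 0.43 quoted in the text for 23 effective samples. All function and parameter names are hypothetical.

```python
import math


def autocorrelation(series, lag):
    """Normalized auto-correlation of the anomalies (mean removed) at a lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    k = abs(lag)
    num = sum(dev[t] * dev[t + k] for t in range(n - k))
    den = sum(d * d for d in dev)
    return num / den


def effective_sample_size(temp, solar, max_lag, dt=1.0):
    """n_e = N * dt / tau_e, with the sum in (2) truncated at +/- max_lag.

    The truncation is an assumption of this sketch; with short series the
    higher-lag terms are poorly determined, so the result is only indicative.
    """
    tau_e = sum(
        autocorrelation(temp, i) * autocorrelation(solar, i) * dt
        for i in range(-max_lag, max_lag + 1)
    )
    return len(temp) * dt / tau_e


def r95(n_e):
    """Correlation lower bound at the 95% level (assumed form of eq. (1))."""
    return 1.96 / math.sqrt(n_e - 2)
```

For example, `r95(23)` rounds to 0.43, and `effective_sample_size(x, x, 0)` returns the full sample size N because only the zero-lag term, which equals unity, is retained.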

Next, Labitzke and van Loon apply
the above to the two series in their fig. 1a. [Solar
10.7-cm flux and 30-mbar North Pole [(January and
February)/2] temperature (N = 32, r = 0.14).]

They find that N must be reduced from 32 to 23.
The resulting $r_{95} = 0.43$, and since the measured value
is 0.14, they say, correctly, that there is no statistical
significance. After treating other questions they again
return to matters of correlation and significance, i.e.
they consider fig. 1b, which is the ‘QBO’ transformed
version of their temperature data. Here we see two
nearly perfectly in-phase sinusoidal oscillations, one
superposed on the other, with a duration of about
three periods (1956-1987). The correlation coefficient
here is r = 0.76; and, at this point more than ever,
one needs to take into account the effects of the
auto-correlations. Labitzke and van Loon then state: ‘If
we assume that each of the winter temperatures are
independent (i.e. $n_e = 17$) this correlation is significant
above the 99% level. Even if one reduces the number
of independent samples by 50% (i.e. $n_e = 9$) it is
significant at the 95% level.’ Unfortunately, as will be
discussed below, the number of degrees of freedom in the data
under consideration is close to four. In any case it
should be noted that they gave as a reason why they
did not calculate $n_e$ from their data the following
statement: ‘because the sampling interval varies’.
Perhaps a better reason is that, in the use of (2) to
calculate $n_e$, the probable errors in $C_T$ and $C_S$ are too
large ($C_T = 0.4$ cannot be distinguished from zero for
N = 17); this worsens for each succeeding value of
the auto-correlations.

With each succeeding lag of the auto-correlation
function the sample size decreases and the error
bounds increase. Therefore, little confidence can be
placed on any estimate of $\tau_e$ from equation (2) if the
total sample size (N) is only 17. However, there is a
more serious problem involving the lack of stationarity
of the time series from which the auto-correlation
function is obtained.

With each succeeding value of the auto-correlation
function, the sample size decreases by 1. Consequently,
with a small basic sample (17 in this case),
end effects become important. That is, the sample of
16 from which lag 1 is obtained is of necessity different
from and inconsistent with the sample of 12 from
which lag 5 is obtained, and each of these samples is
in turn inconsistent with the samples from which the
other lag correlations are obtained. The inhomogeneity
among the auto-correlation coefficients is
exacerbated with increasing lag and makes it impossible
to obtain a valid estimate of $\tau_e$.

It is nevertheless appropriate here to discuss further
the estimated number of degrees of freedom in the
solar activity and temperature sinusoidal time-series
that was mentioned earlier. If these series were in
fact two perfect sine waves, then there would be four
degrees of freedom in view of the fact that only the
phase and period of each sine wave matter in the
present context. In other words, since each sine wave
has two degrees of freedom, the total must be four.
Of course neither of the actual time-series is a perfect
sine wave, and this raises the question of whether
or not there are more degrees of freedom due to this
fact. In order to answer this question the reader should
imagine what would happen to the two time-series if
a least-squares fitted sine wave were subtracted out of
each of them. Would there remain any compelling
correlation after this procedure is carried out? It
appears to be obvious that the answer is ‘no’. To the
extent that this ‘no’ answer is correct, the estimate of
four degrees of freedom is also correct; but a skeptical
reader who needs further proof of this may wish
to consult Pierce (1977) for more discussion of this
approach. If one then assumes that the correlation is
r = 0.76 and the number of degrees of freedom is
d.f. = 4, then the value of ‘t’ in the ‘t-test’ for level of
significance of r (Moroney, 1951) is given by

$$t = \frac{r\sqrt{\mathrm{d.f.}}}{\sqrt{1 - r^2}} = 2.34.$$

But in order to obtain a 5% level of significance, t must
be at least as large as 2.78. Therefore the correlation is
not significant.
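
The arithmetic of this test can be checked with a short sketch. It assumes the statistic has the form $t = r\sqrt{\mathrm{d.f.}}/\sqrt{1 - r^2}$, which reproduces the value 2.34 given above; the critical value 2.78 is copied from the text rather than computed, and the names used are illustrative.

```python
import math


def t_statistic(r, dof):
    """t value for a correlation r with dof degrees of freedom (assumed form)."""
    return r * math.sqrt(dof) / math.sqrt(1.0 - r * r)


T_CRITICAL_5PCT = 2.78  # quoted in the text for four degrees of freedom

t_value = t_statistic(0.76, 4)
print(round(t_value, 2))           # 2.34
print(t_value >= T_CRITICAL_5PCT)  # False: not significant at the 5% level
```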

It should now thus be clear that Labitzke and van
Loon have not yet shown valid quantitative evidence
for a statistically significant solar-weather correlation.
Is there another, more valid test in this regard
for one to try? The answer is ‘perhaps’ and it is
described in Gottman (1981).

In essence this latter technique (called the Gottman-Ringland
procedure) compares two time-series
models. Let $S_t$ and $T_t$ represent the solar flux and
North Pole temperatures as a function of time t. The
two models are

$$T_t = \sum_{i=1}^{H} a_i T_{t-i} + e_t \qquad (3)$$

and

$$T_t = \sum_{i=1} b_i T_{t-i} + e_t + \boxed{\sum_{i=1} D_i S_{t-i}},$$

$$S_t = \sum_{i=1} c_i S_{t-i} + n_t \qquad (4)$$

where $n_t$ and $e_t$ represent random noise and the term
in the box includes a causal influence. Significance is
determined by means of a likelihood ratio test to see
if model (4) does a significantly better job of prediction
than (3).
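
A minimal sketch of such a likelihood ratio comparison is given below. It is not Gottman's implementation: both models are truncated to a single lag, the coefficients are fitted by ordinary least squares on mean-removed series, and Gaussian noise is assumed, so that the statistic n ln(RSS3/RSS4) is asymptotically chi-square distributed with one degree of freedom under the null hypothesis of no solar influence. The function name is hypothetical.

```python
import math


def likelihood_ratio_order1(temp, solar):
    """LR statistic comparing models (3) and (4), each truncated to one lag.

    Assumptions of this sketch: mean-removed series, least-squares fits,
    Gaussian noise. Only the temperature equation of model (4) is fitted.
    """
    n = len(temp)
    t_mean = sum(temp) / n
    s_mean = sum(solar) / n
    T = [v - t_mean for v in temp]
    S = [v - s_mean for v in solar]

    y = T[1:]    # T_t
    x1 = T[:-1]  # T_{t-1}
    x2 = S[:-1]  # S_{t-1}

    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))

    # Model (3): T_t = a * T_{t-1} + e_t
    a = s1y / s11
    rss3 = sum((c - a * p) ** 2 for c, p in zip(y, x1))

    # Model (4): T_t = b * T_{t-1} + D * S_{t-1} + e_t (2x2 normal equations)
    det = s11 * s22 - s12 * s12
    b = (s1y * s22 - s2y * s12) / det
    d = (s2y * s11 - s1y * s12) / det
    rss4 = sum((c - b * p - d * q) ** 2 for c, p, q in zip(y, x1, x2))

    return len(y) * math.log(rss3 / rss4)
```

A value above 3.84 (the 5% point of chi-square with one degree of freedom) would favour model (4), but only when the series is long enough for the asymptotic distribution to be a reasonable approximation.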

This test has not yet been applied to the data of fig.
1b of Labitzke and van Loon. However, and this is
most unfortunate, the Gottman-Ringland procedure
again must rely on auto-correlations which, as we
already pointed out above, cannot be calculated with
the requisite confidence when one has merely 17 points
at one’s disposal.

The reader may wonder why we mention this procedure
in view of the fact that it is not valid to apply
it to so few data. Our purpose has been to show
what would actually be needed to provide statistical
evidence of the cause and effect relationship that has
been claimed.

The issue of the effective sample size (or the equivalent
number of independent data points) in data
samples with auto-correlation in time (or space) has
been extensively examined by Thiébaux and Zwiers
(1984). Using a Monte-Carlo approach with sample
sizes ranging from 30 to 240, with two different data
generating models and a number of values of the
lag 1 auto-correlation, they compare seven different
methods of estimating effective sample size (ESS).
They conclude that ESS is difficult to estimate reliably
and that, without knowledge of the power spectrum
of the observed process, ESS cannot be estimated
reliably without a large data sample. Even with the
largest data samples (N = 240) there is considerable
variability in the estimates of ESS among the seven
methods, although one or two methods succeed in
approximating the ‘true’ (model generated) values of
ESS. However, with N = 30 (almost twice the size
of the Labitzke and van Loon sample) there is no
confidence in any of the estimates of ESS.

It should be clear now that, while the data of
Labitzke and van Loon are very interesting, suggestive,
and even dramatic, these authors have not shown
that the relationship could not have occurred by
chance alone, notwithstanding the authors’ positive
Monte-Carlo experiment.

Barnston and Livezey (1989) have addressed this
question by carrying out a very extensive series of
Monte-Carlo experiments with essentially the same
data examined by Labitzke and van Loon. They urge
‘caution and reserve’ in accepting the reality of the
relationship in spite of a number of apparently statistically
significant relationships between the solar flux
and the QBO filtered large-scale upper air pressure
and surface temperature fields.

A recent and very thought-provoking paper has been
published by Teitelbaum and Bauer (1990) which
suggests a possible explanation of the observations of
Labitzke and van Loon (1988). They showed that
the application of the QBO transformation to the data
could itself be the cause of the solar-cycle period that
was observed in the temperature data. They reason
that the QBO possibly modulates the stratospheric
temperature so that the latter has a cycle of about
27.7 months. The sampling procedures involved,
however, lead to an approximately 24-month sampling
interval. Thus there would be a phase shift
observed of 3.7 months for each data point, leading to
a complete cycle of shift in about 15 y. This ‘stroboscopic’
or ‘alias’ effect would generate a cycle with
period indistinguishable from a solar period in the
sample length of 32 yr presently available. Further
data may either confirm or invalidate this stroboscopic
effect. We feel that this possibility can only
further emphasize the present need for caution.
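
The arithmetic of the stroboscopic effect described above can be sketched in a few lines. The sketch assumes the standard aliasing relation for a cycle sampled at an interval slightly shorter than its period, and uses only the 27.7-month and 24-month figures quoted in the text; the function name is hypothetical.

```python
def alias_period_months(signal_period, sampling_interval):
    """Apparent period of a cycle sampled at nearly its own period.

    Each sample lags the cycle by (signal_period - sampling_interval), so a
    full cycle of phase shift needs signal_period / lag samples, each one
    sampling_interval long.
    """
    lag = abs(signal_period - sampling_interval)
    return signal_period * sampling_interval / lag


apparent = alias_period_months(27.7, 24.0)
print(round(apparent / 12.0, 1))  # apparent period in years: 15.0
```

With only 32 yr of data, about two such apparent cycles are available, which is why a period of this length is hard to separate from the solar cycle.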

In conclusion, we feel that it is not possible to
demonstrate reality with the available, small,
non-independent data samples. In this connection we think
that further attempts at establishing statistical significance
of a result based upon such a small sample
would be unproductive. We believe that a demonstration
(of reality) must be based on independent
data in which the expected relationship is maintained,
or on a solid physical theory which itself predicts
the postulated relationships. On the other hand, it is
hoped that our remarks will bring about a further
healthy debate and hence more clarification on this
highly interesting subject.